Web archives and the evolution of the digital economy
In collaboration with Andre Carrascal Incera & George Willis [work in progress]
\[trade_{ijt} \sim hyperlinks_{ijt} + distance_{ij} + pop.density_{it} + pop.density_{jt} + empl_{it} + empl_{jt}\]
\[\begin{align} R^2 = 1 - \frac{\sum_{k} (y_{k} - \hat{y}_{k})^2} {\sum_{k} (y_{k} - \overline{y})^2} \label{eq:rsquared} \end{align}\]
\[\begin{align} MAE = \frac{1}{N} \sum_{k = 1}^{N} |\hat{y}_{k} - y_{k}| \label{eq:mae} \end{align}\]
\[\begin{align} RMSE = \sqrt{\frac{\sum_{k = 1}^{N} (\hat{y}_{k} - y_{k})^2} {N}} \label{eq:rmse} \end{align}\]
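The three accuracy metrics defined above can be computed directly from observed and predicted values. The following is a minimal sketch in plain Python, not the code used for this project; the function names and the toy vectors `y_obs` and `y_pred` are hypothetical and chosen only for illustration.

```python
import math


def r_squared(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yk - fk) ** 2 for yk, fk in zip(y, y_hat))
    ss_tot = sum((yk - mean_y) ** 2 for yk in y)
    return 1 - ss_res / ss_tot


def mae(y, y_hat):
    """Mean absolute error: average of |y_hat_k - y_k|."""
    return sum(abs(fk - yk) for yk, fk in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Root mean squared error: square root of the mean squared residual."""
    return math.sqrt(sum((fk - yk) ** 2 for yk, fk in zip(y, y_hat)) / len(y))


# Hypothetical toy data, not taken from the project.
y_obs = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]

print(r_squared(y_obs, y_pred))  # 1 - 2/20
print(mae(y_obs, y_pred))        # 2/4
print(rmse(y_obs, y_pred))       # sqrt(2/4)
```

Because RMSE squares the residuals before averaging, a few large errors raise it much more than they raise MAE, which is one reason to report both.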
| level | freq | perc | cumfreq | cumperc |
|---|---|---|---|---|
| (0,1] | 41596 | 0.718 | 41596 | 0.718 |
| (1,2] | 6451 | 0.111 | 48047 | 0.830 |
| (2,10] | 6163 | 0.106 | 54210 | 0.936 |
| (10,100] | 2975 | 0.051 | 57185 | 0.988 |
| (100,1000] | 646 | 0.011 | 57831 | 0.999 |
| (1000,10000] | 62 | 0.001 | 57893 | 1.000 |
| (10000,100000] | 4 | 0.000 | 57897 | 1.000 |
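The `perc`, `cumfreq` and `cumperc` columns of the table above follow from the `freq` column alone. As a minimal sketch (the variable names are assumptions, and rounding to three decimals is inferred from how the table is printed):

```python
from itertools import accumulate

# Bin labels and frequencies copied from the table above.
levels = ["(0,1]", "(1,2]", "(2,10]", "(10,100]",
          "(100,1000]", "(1000,10000]", "(10000,100000]"]
freq = [41596, 6451, 6163, 2975, 646, 62, 4]

total = sum(freq)                      # number of observations across all bins
cumfreq = list(accumulate(freq))       # running total of frequencies
perc = [round(f / total, 3) for f in freq]
cumperc = [round(c / total, 3) for c in cumfreq]

for row in zip(levels, freq, perc, cumfreq, cumperc):
    print(*row, sep=" | ")
```

The first two bins already cover about 83% of the observations, which shows how strongly right-skewed the distribution is.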
| year | hyperlinks | distance |
|---|---|---|
| 2000 | 0.539 | -0.219 |
| 2001 | 0.578 | -0.221 |
| 2002 | 0.793 | -0.221 |
| 2003 | 0.483 | -0.220 |
| 2004 | 0.807 | -0.223 |
| 2005 | 0.643 | -0.219 |
| 2006 | 0.585 | -0.219 |
| 2007 | 0.598 | -0.214 |
| 2008 | 0.491 | -0.205 |
| 2009 | 0.922 | -0.207 |
| 2010 | 0.674 | -0.205 |
| year | RMSE | Rsquared | MAE |
|---|---|---|---|
| 2002 | 951.04 | 0.96 | 166.99 |
| 2003 | 1254.95 | 0.94 | 230.47 |
| 2004 | 1019.69 | 0.95 | 179.42 |
| 2005 | 1852.54 | 0.89 | 310.94 |
| 2006 | 1713.55 | 0.92 | 307.53 |
| 2007 | 1974.77 | 0.90 | 210.49 |
| 2008 | 1534.67 | 0.92 | 248.84 |
| 2009 | 1237.98 | 0.93 | 215.63 |
| 2010 | 3165.46 | 0.63 | 302.44 |
| 1. Modelling and simulation | 5. Dynamics |
| 2. AI and machine learning | 6. Visualisation and visual analytics |
| 3. Breadth of application | 7. Data ethics and public engagement |
| 4. Validation and uncertainty | 8. Data platforms |
https://www.turing.ac.uk/research/research-programmes/urban-analytics
Biau, Gérard. 2012. “Analysis of a Random Forests Model.” Journal of Machine Learning Research 13 (Apr): 1063–95.
Breiman, Leo. 2001. “Random Forests.” Machine Learning 45 (1): 5–32.
Caruana, Rich, Nikos Karampatziakis, and Ainur Yessenalina. 2008. “An Empirical Evaluation of Supervised Learning in High Dimensions.” In Proceedings of the 25th International Conference on Machine Learning, 96–103. ICML ’08. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/1390156.1390169.
Halavais, Alexander. 2000. “National Borders on the World Wide Web.” New Media & Society 2 (1): 7–28.
Holmberg, Kim. 2010. “Co-Inlinking to a Municipal Web Space: A Webometric and Content Analysis.” Scientometrics 83 (3): 851–62.
Holmberg, Kim, and Mike Thelwall. 2009. “Local Government Web Sites in Finland: A Geographic and Webometric Analysis.” Scientometrics 79 (1): 157–69.
Janc, Krzysztof. 2015. “Geography of Hyperlinks—Spatial Dimensions of Local Government Websites.” European Planning Studies 23 (5): 1019–37.
Jones, Brant W, Ben Spigel, and Edward J Malecki. 2010. “Blog Links as Pipelines to Buzz Elsewhere: The Case of New York Theater Blogs.” Environment and Planning B: Planning and Design 37 (1): 99–111.
Keßler, Carsten. 2017. “Extracting Central Places from the Link Structure in Wikipedia.” Transactions in GIS 21 (3): 488–502.
Krüger, Miriam, Jan Kinne, David Lenz, and Bernd Resch. 2020. “The Digital Layer: How Innovative Firms Relate on the Web.” ZEW-Centre for European Economic Research Discussion Paper, no. 20-003.
Liaw, Andy, Matthew Wiener, and others. 2002. “Classification and Regression by randomForest.” R News 2 (3): 18–22.
Lin, Jia, Alexander Halavais, and Bin Zhang. 2007. “The Blog Network in America: Blogs as Indicators of Relationships Among US Cities.” Connections 27 (2): 15–23.
Salvini, Marco M, and Sara I Fabrikant. 2016. “Spatialization of User-Generated Content to Uncover the Multirelational World City Network.” Environment and Planning B: Planning and Design 43 (1): 228–48.
Vaughan, Liwen. 2004. “Exploring Website Features for Business Information.” Scientometrics 61 (3): 467–77.
Vaughan, Liwen, Yijun Gao, and Margaret Kipp. 2006. “Why Are Hyperlinks to Business Websites Created? A Content Analysis.” Scientometrics 67 (2): 291–300.
Vaughan, Liwen, and Guozhu Wu. 2004. “Links to Commercial Websites as a Source of Business Information.” Scientometrics 60 (3): 487–96.
Yan, Xiang, Xinyu Liu, and Xilei Zhao. 2020. “Using Machine Learning for Direct Demand Modeling of Ridesourcing Services in Chicago.” Journal of Transport Geography 83: 102661.